Flink beginner: WordCount throws an error

The code in the program:

        DataSet<Tuple2<String, Integer>> wordCountDataSet =
                inputDataSet.flatMap(new MyFlatMapper())
                        .groupBy(0)
                        .sum(1);
        // Print the result
        wordCountDataSet.print();
    }

    public static class MyFlatMapper implements FlatMapFunction<String, Tuple2<String, Integer>> {
        @Override
        public void flatMap(String value, Collector<Tuple2<String, Integer>> out) throws Exception {
            // Split each line on spaces and emit a (word, 1) pair per word
            String[] words = value.split(" ");
            for (String word : words) {
                out.collect(new Tuple2<String, Integer>(word, 1));
            }
        }
    }

I then tried to rewrite the code above as a lambda expression:

DataSet<Tuple2<String, Integer>> resultSet = inputDataSet.flatMap((String value, Collector<Tuple2<String, Integer>> collector) -> {
    String[] words = value.split(" ");
    for (String word : words) {
        collector.collect(new Tuple2<>(word, 1));
    }
}).groupBy(0).sum(1); // no .returns(...) hint yet, so this is the version that fails

That produced the following error:

Exception in thread "main" org.apache.flink.api.common.functions.InvalidTypesException: The return type of function 'main(WordCount.java:21)' could not be determined automatically, due to type erasure. You can give type information hints by using the returns(...) method on the result of the transformation call, or by letting your function implement the 'ResultTypeQueryable' interface.
at org.apache.flink.api.java.DataSet.getType(DataSet.java:178)
at org.apache.flink.api.java.DataSet.groupBy(DataSet.java:701)
at cn.xiaojia521.wc.WordCount.main(WordCount.java:27)
Caused by: org.apache.flink.api.common.functions.InvalidTypesException: The generic type parameters of 'Collector' are missing. In many cases lambda methods don't provide enough information for automatic type extraction when Java generics are involved. An easy workaround is to use an (anonymous) class instead that implements the 'org.apache.flink.api.common.functions.FlatMapFunction' interface. Otherwise the type has to be specified explicitly using type information.
at org.apache.flink.api.java.typeutils.TypeExtractionUtils.validateLambdaType(TypeExtractionUtils.java:350)
at org.apache.flink.api.java.typeutils.TypeExtractionUtils.extractTypeFromLambda(TypeExtractionUtils.java:176)
at org.apache.flink.api.java.typeutils.TypeExtractor.getUnaryOperatorReturnType(TypeExtractor.java:571)
at org.apache.flink.api.java.typeutils.TypeExtractor.getFlatMapReturnTypes(TypeExtractor.java:196)
at org.apache.flink.api.java.DataSet.flatMap(DataSet.java:266)
at cn.xiaojia521.wc.WordCount.main(WordCount.java:21)

Solution:

https://stackoverflow.com/questions/50945509/apache-flink-return-type-of-function-could-not-be-determined-automatically-due

.returns(Types.TUPLE(Types.STRING, Types.INT)) // When a lambda is used for the functional interface here, the generic return type must be specified explicitly

DataSet<Tuple2<String, Integer>> resultSet = inputDataSet.flatMap((String value, Collector<Tuple2<String, Integer>> collector) -> {
    String[] words = value.split(" ");
    for (String word : words) {
        collector.collect(new Tuple2<>(word, 1));
    }
}).returns(Types.TUPLE(Types.STRING, Types.INT)).groupBy(0).sum(1);
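
To make the "type erasure" part of the error message more concrete, here is a small plain-Java sketch with no Flink dependency. It only illustrates the general idea and is not how Flink's TypeExtractor is actually implemented, which is more involved. Emitter and FlatMapper are hypothetical stand-ins for Flink's Collector and FlatMapFunction, and ErasureDemo and keepsTypeArguments are names made up for this example. The sketch checks, via reflection, whether an object's class still records the generic type arguments of the interface it implements.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

class ErasureDemo {

    // Hypothetical stand-in for Flink's Collector<T>.
    interface Emitter<T> {
        void emit(T record);
    }

    // Hypothetical stand-in for Flink's FlatMapFunction<IN, OUT>.
    interface FlatMapper<IN, OUT> {
        void flatMap(IN value, Emitter<OUT> out);
    }

    // Anonymous class: the compiler stores "implements FlatMapper<String, Integer>"
    // in the class file, so the type arguments are visible through reflection.
    static FlatMapper<String, Integer> anonymousMapper() {
        return new FlatMapper<String, Integer>() {
            @Override
            public void flatMap(String value, Emitter<Integer> out) {
                out.emit(value.length());
            }
        };
    }

    // Lambda: the class is generated at runtime and implements only the raw
    // FlatMapper interface, so the type arguments are not recorded on it.
    static FlatMapper<String, Integer> lambdaMapper() {
        return (value, out) -> out.emit(value.length());
    }

    // Returns true if any interface of the object's class is recorded
    // with its generic type arguments (a ParameterizedType).
    static boolean keepsTypeArguments(Object fn) {
        for (Type t : fn.getClass().getGenericInterfaces()) {
            if (t instanceof ParameterizedType) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("anonymous class keeps type arguments: "
                + keepsTypeArguments(anonymousMapper()));
        System.out.println("lambda keeps type arguments: "
                + keepsTypeArguments(lambdaMapper()));
    }
}
```

On a standard JDK this should report true for the anonymous class and false for the lambda. That matches the two workarounds named in the error message: use an (anonymous) class that implements FlatMapFunction, or keep the lambda and supply the type explicitly with returns(...).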